26 research outputs found

    On identifiability of MAP processes

    Two types of transitions can be found in the Markovian Arrival process or MAP: with and without arrivals. In transient transitions the chain jumps from one state to another with no arrival; in effective transitions, a single arrival occurs. We assume that in practice only arrival times are observed in a MAP. This leads us to define and study the Effective Markovian Arrival process or E-MAP. In this work we define identifiability of MAPs in terms of equivalence between the corresponding E-MAPs and study conditions under which two sets of parameters induce identical laws for the observable process, in the case of 2- and 3-state MAPs. We illustrate and discuss our results with examples.
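
    A minimal sketch of the observation scheme described above: simulate a continuous-time MAP from matrices D0 (transitions without arrivals) and D1 (transitions with arrivals) and keep only the arrival epochs, i.e. the E-MAP sample. The particular parameter values are illustrative and not taken from the paper.

        # Sketch only: simulate a 2-state MAP and record the observable arrival times.
        import numpy as np

        rng = np.random.default_rng(0)

        # Hypothetical MAP2 parameters; each row of D0 + D1 must sum to zero.
        D0 = np.array([[-3.0, 1.0],
                       [ 0.5, -2.0]])
        D1 = np.array([[ 1.5, 0.5],
                       [ 0.5, 1.0]])

        def simulate_emap(D0, D1, n_arrivals, state=0):
            """Return the first n_arrivals arrival epochs of the MAP (D0, D1)."""
            times, t = [], 0.0
            while len(times) < n_arrivals:
                rate = -D0[state, state]              # total exit rate of the current state
                t += rng.exponential(1.0 / rate)
                # Probabilities of each possible transition out of `state`.
                probs = np.concatenate([D0[state], D1[state]]) / rate
                probs[state] = 0.0                    # exclude the (negative) diagonal entry
                k = rng.choice(2 * len(D0), p=probs / probs.sum())
                if k >= len(D0):                      # effective transition: an arrival occurs
                    times.append(t)
                state = k % len(D0)
            return np.array(times)

        arrivals = simulate_emap(D0, D1, n_arrivals=10_000)
        print("mean inter-arrival time:", np.diff(arrivals).mean())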

    Inference for double Pareto lognormal queues with applications

    In this article we describe a method for carrying out Bayesian inference for the double Pareto lognormal (dPlN) distribution, which has recently been proposed as a model for heavy-tailed phenomena. We apply our approach to inference for the dPlN/M/1 and M/dPlN/1 queueing systems. These systems cannot be analyzed using standard techniques because the dPlN distribution does not possess a Laplace transform in closed form. This difficulty is overcome using some recent approximations for the Laplace transform of the Pareto/M/1 system. Our procedure is illustrated with applications in internet traffic analysis and risk theory.
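
    As a hedged illustration of the dPlN model itself (not of the paper's Bayesian procedure), one can draw from it via the Reed-Jorgensen construction X = exp(Z + E1/alpha - E2/beta), where Z is normal and E1, E2 are independent unit exponentials; the upper tail then decays like a Pareto with index alpha. Parameter values below are placeholders.

        # Sketch: sample from the double Pareto lognormal (dPlN) distribution.
        import numpy as np

        rng = np.random.default_rng(1)

        def rdpln(n, alpha=2.5, beta=1.5, nu=0.0, tau=0.5):
            z  = rng.normal(nu, tau, size=n)      # lognormal body
            e1 = rng.exponential(1.0, size=n)     # upper (Pareto) tail, index alpha
            e2 = rng.exponential(1.0, size=n)     # lower (power-law) tail, index beta
            return np.exp(z + e1 / alpha - e2 / beta)

        x = rdpln(100_000)
        print("sample mean:", x.mean(), "  0.999 quantile:", np.quantile(x, 0.999))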

    Bayesian analysis of the stationary MAP2

    In this article we describe a method for carrying out Bayesian estimation for the two-state stationary Markov arrival process (MAP2), which has been proposed as a versatile model in a number of contexts. The approach is illustrated on both simulated and real data sets, where the performance of the MAP2 is compared against that of the well-known MMPP2. As an extension of the method, we estimate the queue length and virtual waiting time distributions of a stationary MAP2/G/1 queueing system, a matrix generalization of the M/G/1 queue that allows for dependent inter-arrival times. Our procedure is illustrated with applications in Internet traffic analysis. Research partially supported by research grants and projects MTM2015-65915-R, ECO2015-66593-P (Ministerio de Economía y Competitividad, Spain) and P11-FQM-7603, FQM-329 (Junta de Andalucía, Spain). The authors thank both the Associate Editor and referee for their constructive comments, from which the paper greatly benefited.
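
    The MAP2/G/1 waiting time distribution mentioned above can also be approximated by plain simulation; the sketch below (not the paper's Bayesian or matrix-analytic method) feeds MAP2 inter-arrival times through the Lindley recursion W_{n+1} = max(0, W_n + S_n - A_{n+1}). The MAP2 parameters and the lognormal service law are hypothetical.

        # Sketch: simulated waiting times in a MAP2/G/1 queue via Lindley's recursion.
        import numpy as np

        rng = np.random.default_rng(2)
        D0 = np.array([[-3.0, 1.0], [0.5, -2.0]])
        D1 = np.array([[ 1.5, 0.5], [0.5,  1.0]])

        def map_interarrivals(D0, D1, n, state=0):
            out, gap = [], 0.0
            while len(out) < n:
                rate = -D0[state, state]
                gap += rng.exponential(1.0 / rate)
                p = np.concatenate([D0[state], D1[state]]) / rate
                p[state] = 0.0
                k = rng.choice(2 * len(D0), p=p / p.sum())
                if k >= len(D0):                      # arrival: record the inter-arrival gap
                    out.append(gap)
                    gap = 0.0
                state = k % len(D0)
            return np.array(out)

        A = map_interarrivals(D0, D1, 100_000)
        S = rng.lognormal(mean=-1.6, sigma=0.5, size=len(A))   # service times (stable load)
        W = np.zeros(len(A))
        for n in range(len(A) - 1):
            W[n + 1] = max(0.0, W[n] + S[n] - A[n + 1])
        print("estimated mean waiting time:", W[len(W) // 2:].mean())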

    Non-identifiability of the two state Markovian Arrival process

    In this paper we consider the problem of identifiability of the two-state Markovian Arrival process (MAP2). In particular, we show that the MAP2 is not identifiable, and conditions are given under which two different sets of parameters induce identical stationary laws for the observable process.
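
    A numerical sanity check related to this statement, but not the paper's characterization: two MAP2 parameterizations inducing the same stationary law for the observed arrivals must share the inter-arrival moments E[T^k] = k! phi (-D0)^{-k} 1 and the lag-1 cross moment E[T_0 T_1] = phi (-D0)^{-1} P (-D0)^{-1} 1, where P = (-D0)^{-1} D1 and phi is the stationary phase distribution just after an arrival. The helper below compares these summaries for any two candidate parameter sets (no specific equivalent pair from the paper is reproduced here).

        # Sketch: necessary-condition check for equivalence of two MAP2 parameter sets.
        import numpy as np
        from math import factorial
        from numpy.linalg import inv, matrix_power

        def map_summaries(D0, D1, kmax=3):
            D = D0 + D1
            # Stationary distribution pi of the background chain: pi D = 0, pi 1 = 1.
            A = np.vstack([D.T, np.ones(len(D))])
            b = np.append(np.zeros(len(D)), 1.0)
            pi = np.linalg.lstsq(A, b, rcond=None)[0]
            phi = pi @ D1 / (pi @ D1 @ np.ones(len(D)))   # phase law after an arrival
            M, e = inv(-D0), np.ones(len(D))
            moments = [factorial(k) * phi @ matrix_power(M, k) @ e
                       for k in range(1, kmax + 1)]
            lag1 = phi @ M @ (M @ D1) @ M @ e             # E[T_0 T_1]
            return np.array(moments + [lag1])

        # Usage (placeholder parameter sets D0a, D1a, D0b, D1b):
        # np.allclose(map_summaries(D0a, D1a), map_summaries(D0b, D1b))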

    Robust newsvendor problem with autoregressive demand

    This paper explores the classic single-item newsvendor problem under a novel setting which combines temporal dependence and tractable robust optimization. First, the demand is modeled as a time series which follows an autoregressive process AR(p), p ≥ 1. Second, a robust approach to maximize the worst-case revenue is proposed: a robust distribution-free autoregressive forecasting method, which copes with non-stationary time series, is formulated. A closed-form expression for the optimal solution is found for p = 1; for the remaining values of p, the problem is expressed as a nonlinear convex optimization program, to be solved numerically. The optimal solution under the robust method is compared with those obtained under two versions of the classic approach, in which either the demand distribution is unknown and assumed to have no autocorrelation, or it is assumed to follow an AR(p) process with normal error terms. Numerical experiments show that our proposal usually outperforms the previous benchmarks, not only with regard to robustness, but also in terms of the average revenue. Research partially supported by the Ministerio de Economía y Competitividad and the Junta de Andalucía (Spain).
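
    For orientation, a minimal sketch of the classic (non-robust) benchmark the abstract refers to, not the paper's robust method: with AR(1) demand and normal errors, the one-step-ahead demand is normal, and the order quantity is its critical-fractile quantile. Prices, costs and AR parameters are illustrative.

        # Sketch: classic newsvendor order under AR(1) demand with normal errors.
        from scipy.stats import norm

        price, cost = 10.0, 6.0
        critical_fractile = (price - cost) / price        # underage / (underage + overage)

        c0, phi, sigma = 20.0, 0.7, 5.0                   # AR(1): d_t = c0 + phi*d_{t-1} + eps
        d_today = 60.0

        # One-step-ahead forecast is Normal(c0 + phi*d_today, sigma^2);
        # the order quantity is its critical-fractile quantile.
        q_star = norm.ppf(critical_fractile, loc=c0 + phi * d_today, scale=sigma)
        print("order quantity:", round(q_star, 2))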

    A sparsity-controlled vector autoregressive model

    Vector autoregressive (VAR) models constitute a powerful and well-studied tool to analyze multivariate time series. Since sparseness, crucial to identify and visualize joint dependencies and relevant causalities, is not expected to happen in the standard VAR model, several sparse variants have been introduced in the literature. However, in some cases it may be of interest to control some dimensions of the sparsity, such as the number of causal features allowed in the prediction. To the authors' knowledge, none of the existing methods endows the user with full control over the different aspects of the sparsity of the solution. In this paper we propose a sparsity-controlled VAR model which allows the user to control different dimensions of the sparsity, enabling a proper visualization of potential causalities and dependencies. The model coefficients are found as the solution to a mathematical optimization problem, solvable by standard numerical optimization routines. The tests performed on both simulated and real-life multivariate time series show that our approach may outperform both the standard and the Group Lasso in terms of prediction errors, especially when highly sparse graphs are sought, while avoiding the VAR's overfitting for more dense graphs. Keywords: causality; Mixed Integer Non Linear Programming; multivariate time series; sparse models; vector autoregressive process. Research partially supported by the Ministerio de Economía y Competitividad and the Junta de Andalucía (Spain).
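
    A sketch of the standard Lasso baseline mentioned in the abstract (not the sparsity-controlled mixed-integer model proposed in the paper): each VAR equation is fitted by an L1-penalised regression on the stacked lags. The simulated data and penalty level are placeholders.

        # Sketch: equation-by-equation Lasso estimation of a VAR(p) on simulated data.
        import numpy as np
        from sklearn.linear_model import Lasso

        rng = np.random.default_rng(3)
        T, k, p = 300, 4, 2                               # sample length, dimension, lag order

        # Simulate a stable VAR(1) to play the role of the data (illustrative only).
        A1 = 0.4 * np.eye(k)
        y = np.zeros((T, k))
        for t in range(1, T):
            y[t] = y[t - 1] @ A1.T + rng.normal(0, 1, k)

        # Lagged design matrix: row t contains [y_{t-1}, ..., y_{t-p}].
        X = np.hstack([y[p - lag: T - lag] for lag in range(1, p + 1)])
        Y = y[p:]

        coefs = np.vstack([Lasso(alpha=0.1, max_iter=10_000).fit(X, Y[:, j]).coef_
                           for j in range(k)])            # one sparse equation per series
        print("nonzero coefficients per equation:", (np.abs(coefs) > 1e-8).sum(axis=1))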

    Variable selection for Naive Bayes classification

    The Naive Bayes has proven to be a tractable and efficient method for classification in multivariate analysis. However, features are usually correlated, a fact that violates the Naive Bayes' assumption of conditional independence and may deteriorate the method's performance. Moreover, datasets are often characterized by a large number of features, which may complicate the interpretation of the results as well as slow down the method's execution. In this paper we propose a sparse version of the Naive Bayes classifier that is characterized by three properties. First, the sparsity is achieved taking into account the correlation structure of the covariates. Second, different performance measures can be used to guide the selection of features. Third, performance constraints on groups of higher interest can be included. Our proposal leads to a smart search, which yields competitive running times, whereas the flexibility in terms of performance measure for classification is integrated. Our findings show that, when compared against well-referenced feature selection approaches, the proposed sparse Naive Bayes obtains competitive results regarding accuracy, sparsity and running times for balanced datasets. In the case of datasets with unbalanced (or with different importance) classes, a better compromise between classification rates for the different classes is achieved. This research is partially supported by research grants and projects MTM2015-65915-R (Ministerio de Economía y Competitividad, Spain) and PID2019-110886RB-I00 (Ministerio de Ciencia, Innovación y Universidades, Spain), FQM-329 and P18-FR-2369 (Junta de Andalucía, Spain), PR2019-029 (Universidad de Cádiz, Spain), Fundación BBVA and EC H2020 MSCA RISE NeEDS Project (Grant agreement ID: 822214). This support is gratefully acknowledged.
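
    To make the idea of variable selection for Naive Bayes concrete, here is a deliberately simplified wrapper: a greedy forward search scored by cross-validated accuracy. It is not the correlation-aware, constraint-based procedure proposed in the paper; the dataset is a standard scikit-learn example chosen for illustration.

        # Sketch: greedy forward feature selection wrapped around Gaussian Naive Bayes.
        from sklearn.datasets import load_breast_cancer
        from sklearn.model_selection import cross_val_score
        from sklearn.naive_bayes import GaussianNB

        X, y = load_breast_cancer(return_X_y=True)

        selected, remaining, best_score = [], list(range(X.shape[1])), 0.0
        while remaining:
            scores = {j: cross_val_score(GaussianNB(), X[:, selected + [j]], y, cv=5).mean()
                      for j in remaining}
            j_best = max(scores, key=scores.get)
            if scores[j_best] <= best_score:              # stop when no feature helps
                break
            best_score = scores[j_best]
            selected.append(j_best)
            remaining.remove(j_best)

        print("selected features:", selected, " cv accuracy:", round(best_score, 3))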

    Cost-sensitive feature selection for support vector machines

    Feature Selection (FS) is a crucial procedure in Data Science tasks such as Classification, since it identifies the relevant variables, making the classification procedures more interpretable and more effective by reducing noise and data overfitting. The relevance of features in a classification procedure is linked to the fact that misclassification costs are frequently asymmetric, since false positive and false negative cases may have very different consequences. However, off-the-shelf FS procedures seldom take into account such cost-sensitivity of errors. In this paper we propose a mathematical-optimization-based FS procedure embedded in one of the most popular classification procedures, namely Support Vector Machines (SVM), accommodating asymmetric misclassification costs. The key idea is to replace the traditional margin maximization by minimizing the number of features selected, while imposing upper bounds on the false positive and false negative rates. The problem is written as an integer linear problem plus a quadratic convex problem for SVM with both linear and radial kernels. The reported numerical experience demonstrates the usefulness of the proposed FS procedure. Indeed, our results on benchmark data sets show that a substantial decrease in the number of features is obtained, whilst the desired trade-off between false positive and false negative rates is achieved.
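
    The paper formulates this as an integer linear plus convex quadratic program; the sketch below is only a crude greedy stand-in for the same idea: repeatedly drop the feature with the smallest linear-SVM coefficient as long as the validation false positive and false negative rates stay below user-chosen bounds. Dataset, bounds and hyperparameters are illustrative.

        # Heuristic sketch: backward feature elimination under FP/FN rate constraints.
        import numpy as np
        from sklearn.datasets import load_breast_cancer
        from sklearn.metrics import confusion_matrix
        from sklearn.model_selection import train_test_split
        from sklearn.preprocessing import StandardScaler
        from sklearn.svm import LinearSVC

        X, y = load_breast_cancer(return_X_y=True)
        X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.3, random_state=0)
        scaler = StandardScaler().fit(X_tr)
        X_tr, X_va = scaler.transform(X_tr), scaler.transform(X_va)
        max_fpr, max_fnr = 0.10, 0.10                     # upper bounds on error rates

        def rates(model, cols):
            tn, fp, fn, tp = confusion_matrix(y_va, model.predict(X_va[:, cols])).ravel()
            return fp / (fp + tn), fn / (fn + tp)

        cols = list(range(X.shape[1]))
        while len(cols) > 1:
            svc = LinearSVC(C=1.0, max_iter=20_000).fit(X_tr[:, cols], y_tr)
            weakest = cols[int(np.argmin(np.abs(svc.coef_[0])))]
            trial = [c for c in cols if c != weakest]
            svc_trial = LinearSVC(C=1.0, max_iter=20_000).fit(X_tr[:, trial], y_tr)
            fpr, fnr = rates(svc_trial, trial)
            if fpr > max_fpr or fnr > max_fnr:            # dropping would violate a bound
                break
            cols = trial

        print("features kept:", len(cols))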

    Constrained Naïve Bayes with application to unbalanced data classification

    The Naïve Bayes is a tractable and efficient approach for statistical classification. In general classification problems, the consequences of misclassifications may be rather different in different classes, making it crucial to control misclassification rates in the most critical and, in many real-world problems, minority cases, possibly at the expense of higher misclassification rates in less problematic classes. One traditional approach to address this problem consists of assigning misclassification costs to the different classes and applying the Bayes rule, by optimizing a loss function. However, fixing precise values for such misclassification costs may be problematic in real-world applications. In this paper we address the issue of misclassification for the Naïve Bayes classifier. Instead of requesting precise values of misclassification costs, threshold values are used for different performance measures. This is done by adding constraints to the optimization problem underlying the estimation process. Our findings show that, under a reasonable computational cost, the performance measures under consideration indeed achieve the desired levels, yielding a user-friendly constrained classification procedure.
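
    A minimal sketch of the underlying idea of imposing a threshold on a performance measure, not the paper's constrained estimation procedure: move the decision threshold of a Gaussian Naive Bayes until the recall of the minority class reaches a required level, then inspect the cost paid on the other class. The dataset and the 0.95 requirement are illustrative.

        # Sketch: threshold moving to enforce a minimum recall on the minority class.
        import numpy as np
        from sklearn.datasets import load_breast_cancer
        from sklearn.model_selection import train_test_split
        from sklearn.naive_bayes import GaussianNB

        X, y = load_breast_cancer(return_X_y=True)        # class 0 is the smaller class
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

        proba0 = GaussianNB().fit(X_tr, y_tr).predict_proba(X_te)[:, 0]
        min_recall0 = 0.95                                # required recall on class 0

        # Lower the threshold for predicting class 0 until its recall meets the bound.
        for thr in np.linspace(0.5, 0.01, 50):
            pred = np.where(proba0 >= thr, 0, 1)
            recall0 = (pred[y_te == 0] == 0).mean()
            if recall0 >= min_recall0:
                break

        acc1 = (pred[y_te == 1] == 1).mean()
        print(f"threshold={thr:.2f}  recall(class 0)={recall0:.3f}  accuracy(class 1)={acc1:.3f}")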